Abstract

The design of algorithms on complex networks, such as routing, ranking or recommendation 
algorithms, requires a detailed understanding of the growth characteristics of the networks of 
interest, such as the Internet, the web graph, social networks or online communities. To this end, 
preferential attachment, in which the popularity (or relevance) of a node is determined by its 
degree, is a well-known and appealing random graph model, whose predictions are in accordance 
with experiments on the web graph and several social networks. However, its central assumption, 
that the popularity of the nodes depends only on their degree, is not a realistic one, since every 
node has potentially some intrinsic quality which can differentiate its attractiveness from other 
nodes with similar degrees. 

In this paper, we provide a rigorous analysis of preferential attachment with fitness, suggested 
by Bianconi and Barabasi and studied by Motwani and Xu, in which the degree of a vertex is 
scaled by its quality to determine its attractiveness. Including quality considerations in the classical
preferential attachment model provides a much more realistic description of many complex
networks, such as the web graph, and allows us to observe a much richer behavior in the growth
dynamics of these networks. Specifically, depending on the shape of the distribution from which
the qualities of the vertices are drawn, we observe three distinct phases, namely a first-mover-advantage
phase, a fit-get-richer phase and an innovation-pays-off phase. We precisely characterize
the properties of the quality distribution that result in each of these phases and we compute the 
exact growth dynamics for each phase. The dynamics provide rich information about the quality 
of the vertices, which can be very useful in many practical contexts, including ranking algorithms 
for the web, recommendation algorithms, as well as the study of social networks. Furthermore, 
the mathematical techniques we introduce to establish these dynamics could be applicable to a 
wide variety of problems. 
1 Introduction



In recent years, there has been a convergence of ideas coming from computer science, social sciences 
and economic sciences as researchers in these fields attempt to model and analyze the characteristics 
and dynamics of large complex networks, such as the web graph, social networks and recommendation 
networks. From the computational perspective, it has been recognized that the successful design of 
algorithms performed on such networks, including routing, ranking and recommendation algorithms, 
must take into account the social dynamics as well as the technical properties and economic incentives 
that govern network growth [22], [23], [15].

Random Graph Models. An appealing way to model the growth dynamics of these networks is 
via random graph models. The well-studied Erdos-Renyi model is not an appropriate description of 
these networks, because it is a static rather than dynamic model, and more importantly, because 
sparse graphs drawn from the Erdos-Renyi model have Poisson degree distributions rather than the 
scale-free (power-law) distributions observed in a variety of social phenomena [26] and verified by
experiments on the World Wide Web [2], [18], [16], the latter seen as a massive graph with web pages
being its vertices and directed edges between vertices corresponding to hyperlinks from one page to 
another. 

Several models have been suggested which result in scale-free distributions, probably the first 
being due to Yule [25] and Simon [24]. In the context of scientific citations, power law distributions
were observed by Lotka [19], and Gilbert [13] specifies a probabilistic model supporting Lotka's
law. Kleinberg et al. [16] and Kumar et al. [18] suggest and study the copy model, which captures
the power law distribution and other connectivity properties of the World Wide Web, while other
models include those of Broder et al. [8], Cooper and Frieze [9], Drinea et al. [11], and Krapivsky and
Redner [17].

Preferential Attachment Models. One of the most natural and attractive models for network 
growth is the preferential attachment model, suggested by Barabasi and Albert [2] to model the 
web graph, and originally proposed as the cumulative advantage model by Derek de Solla Price in 
1965 [10]. See e.g. [3], [6] for a rigorous treatment. Roughly speaking, as time evolves, new vertices join
the network by adding several links to the vertices already present in the network in a probabilistic 
fashion. The probability of linking to an existing vertex is an increasing function, usually polynomial, 
in its degree, which captures the intuitive fact that higher degree of a vertex reflects higher relevance 
or popularity. 

This model by itself has been rather successful in predicting the graph structure of the web [2], 
at least as an undirected graph. Nevertheless, there is an unsatisfactory assumption underlying the
model: the popularity of a vertex depends only on its degree. As a result, the prediction of the model
is the so-called first-mover-advantage phenomenon in which earlier vertices tend to have significantly 
higher degrees than later ones, making it hard for a vertex which enters late to compete with the 
already established hubs of the network. Moreover, the model is completely symmetric with respect 
to vertices which enter at similar times, since there is no modeling of how the intrinsic quality of 
every vertex affects its growth in the network. How is the quality of vertices reflected in the network 
structure and its dynamics? How can one extract such information? 

To answer questions of this type, we analyze a variant of the preferential attachment model which
explicitly models the intrinsic quality of the vertices. This model, introduced in the context of the
web by Bianconi and Barabasi [1], is usually called preferential attachment with fitness. In this model,
when a new vertex is created, it gets assigned a quality parameter, henceforth called fitness, drawn 
from a given distribution, which scales its degree to determine its attractiveness in the evolution of 
the network. The resulting model provides a much more accurate description of many real-world 
networks [3], but it is also more difficult to analyze rigorously; see Bianconi and Barabasi [3] for
heuristic arguments and Motwani and Xu [21] for more precise, but nevertheless heuristic in several
aspects, arguments.

Our Results. We provide the first — to our knowledge — rigorous analysis of preferential attachment 
with fitness. We show that, depending on the properties of the distribution from which the fitnesses 
are drawn, henceforth called the fitness distribution, there is a much richer behavior that an evolving 
network may exhibit than what is predicted by the classical preferential attachment model. We 
precisely characterize the possible evolutions of a complex network and we specify the properties of 
the fitness distribution resulting in each of them. More precisely, we show that, depending on the 
fitness distribution, an evolving network can undergo one of the following behaviors, or phases: 

• the first-mover-advantage phase, which results from flat fitness distributions and corresponds 
to the power-law behavior predicted by the classical preferential attachment model; 

• the fit-get-richer phase, in which vertices of higher fitness grow faster than those of smaller
fitness; the behavior here is a power-law within each fitness value, but the tail exponent decreases
as the fitness increases;

• the innovation-pays-off phase, in which, roughly speaking, the competition for links results in
a constant fraction of the links continuously shifting to ever larger fitness values; this fraction 
of links that "escapes to infinity" is independent of the network size and is determined by the 
fitness distribution; such behavior is not observed in the fit-get-richer phase. 

Our analysis is applicable to both discrete and continuous fitness distributions, as well as bounded 
or unbounded ones, and we provide precise criteria for the fitness distribution that specify which of the 
above phases will arise. In fact, we identify a property of the fitness distribution which exhibits
a sharp phase transition separating the latter two evolution scenarios. Our results are in accordance with
the predictions of Bianconi and Barabasi [4] derived by mapping the evolving network to a Bose gas 
in the thermodynamic limit. In this terminology, the innovation-pays-off phase corresponds to the 
phenomenon of Bose-Einstein condensation, whereby a constant fraction of the particles condenses
on the lowest energy level, corresponding in the network context to the supremum of the fitness 
values. 

A by-product of our technique is a precise characterization of the vertex dynamics under preferential
attachment with fitness. More specifically, if a vertex $v$ has fitness $f$, then our analysis implies
that its degree $d_v(t)$ at time $t$ scales as

$$d_v(t) \sim t^{c f}, \tag{1}$$

where c is a global constant determined by the fitness distribution. Hence, the logarithm of the 
degree of the vertices directly reflects their quality. This could suggest new directions in the design 
of ranking or recommendation algorithms. 

Proof Techniques. The standard approach to analyze preferential attachment models is to derive
recursions (or differential equations), typically for the expected number of nodes of a given degree.
See e.g. [20]. This type of technique relies crucially on the fact that the number of nodes at any time
in the graph is deterministic, a quantity that arises as the denominator in the recursion. However, in
our case, the relevant quantity is the number of nodes weighted by their fitness which, unfortunately,
is a random variable. This turns out to complicate the analysis significantly.

To obtain our results, we rely instead on a very different approach, one based on the theory of 
Polya urn models. In Polya's classical urn scheme, an urn contains balls of two colors. At each 
time step, a ball is drawn randomly from the urn and returned along with an extra ball of the same 
color. This is clearly reminiscent of a preferential attachment scheme and the connection between 
the two models has previously been exploited, e.g. in [3]. Here we use a generalized version of Polya's
scheme (see e.g. [14]): 1) we consider an arbitrary, but finite number of colors; 2) each ball is picked
proportionally to a weight, or "activity parameter", associated with its color; and 3) at each time step,
the ball picked is returned along with a random number of balls of each color, where the distribution
of this "random update vector" depends on the color of the ball drawn.

We analyze the limiting behavior of the preferential attachment scheme with fitness by coupling 
the growth process with specially crafted generalized Polya urn models where the colors represent 
connectivity properties of the evolving network, e.g. the cumulative degree of all vertices of a given 
fitness. When the fitness distribution is concentrated on a finite number of atoms, the correspondence
is somewhat straightforward, although our coupling appears to be novel and it allows us to derive
nontrivial generalizations of classic results very easily. More importantly, we consider in fact general
fitness distributions, including continuous distributions, which in principle require an infinite number 
of colors in the Polya urn model. Little is known about the behavior of generalized Polya urns 
beyond the finite case, and we resort to various novel truncation techniques to map the dynamics of 
our network to a finite urn process. We expect that our techniques should be useful in a much more 
general context to the analysis of previously unapproachable complex network growth models, which 
now may be analyzed using infinite Polya urn models with techniques analogous to those developed 
here. 

1.1 Definitions and Main Result 

The Model. The generalized preferential attachment model of Bianconi and Barabasi which we 
analyze here is a random graph model defined as follows. 

Definition 1 (Preferential Attachment Scheme with Fitness) Let $\mathcal{F} \subset \mathbb{R}_+$ be a set of
fitnesses and $Q$ a distribution over fitnesses such that $\int_{\mathcal{F}} dQ(f) = 1$. The preferential attachment
process with fitness begins with one vertex of fitness $f \in \mathcal{F}$ drawn according to $Q$ and a self-loop
on that vertex. Then, at every time step $t$, a new vertex is added to the graph, which has fitness
picked independently according to $Q$ and is attached to an old vertex $v$ with probability proportional
to $f_v \cdot d_{v,t-1}$, where $f_v$ is the fitness of vertex $v$ and $d_{v,t-1}$ its degree at step $t - 1$. We denote by
$G_n = (V_n, E_n)$ the graph at time $n$. We sometimes refer to this process as the $(\mathcal{F}, Q)$-chain.
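
To make Definition 1 concrete, the following is a minimal simulation sketch. It assumes a finite fitness set with strictly positive fitnesses; the function name `simulate_fitness_chain` and its parameters are hypothetical and are not taken from the paper. Each step recomputes all weights, so the sketch runs in quadratic time and is meant for small experiments only.

```python
import random

def simulate_fitness_chain(n_steps, fitnesses, probs, seed=0):
    """Simulate n_steps of the (F, Q)-chain; return per-vertex fitness and degree lists.

    Assumption: `fitnesses` are strictly positive and `probs` are the weights of Q.
    """
    rng = random.Random(seed)
    # Initial vertex: fitness drawn from Q, with a self-loop (degree 2).
    fit = [rng.choices(fitnesses, weights=probs)[0]]
    deg = [2]
    for _ in range(n_steps):
        # Pick an old vertex with probability proportional to fitness * degree.
        weights = [f * d for f, d in zip(fit, deg)]
        v = rng.choices(range(len(fit)), weights=weights)[0]
        deg[v] += 1
        # The new vertex gets an independent fitness from Q and starts with degree 1.
        fit.append(rng.choices(fitnesses, weights=probs)[0])
        deg.append(1)
    return fit, deg

fit, deg = simulate_fitness_chain(200, [1.0, 2.0], [0.5, 0.5], seed=1)
print(len(fit), sum(deg))
```

Since the process starts with one self-loop and adds one edge per step, the degrees always sum to twice the number of vertices, which is a convenient sanity check on any implementation.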

It turns out that the case of unbounded fitnesses is rather uninteresting (see Appendix C.4) and
from here on we assume that $\sup\{f : f \in \mathcal{F}\} = h$ for some $h < +\infty$. Furthermore, we consider three
main cases for $\mathcal{F}$: either $\mathcal{F}$ is discrete (finite or countable) with $Q$ strictly positive on $\mathcal{F}$, or $\mathcal{F}$
is the interval $[0, h]$ and $Q$ admits a strictly positive continuous density on $(0, h)$. We say that
$(\mathcal{F}, Q)$ is regular in such cases. Our results extend to more general fitness distributions but we
restrict ourselves to the regular case here. Also, the process above constructs only undirected trees.
However, our techniques can be easily extended to directed scale-free graphs as defined in [5]. We
omit the details.

Main Result. Our basic result concerns the distribution of links across fitnesses as $n \to +\infty$. Let
$[a, b] \subset [0, h]$ with $a < b$ and denote by $M_{n,[a,b]}$ the number of edge endpoints with fitness in $[a, b]$ in
$G_n$. Let $\lambda_0$ be the (unique) solution in $[h, +\infty)$ of

$$I(\lambda) \equiv \int_{\mathcal{F}} \frac{f}{\lambda - f}\, dQ(f) = 1, \tag{2}$$

if it exists and let $\lambda_0 = h$ otherwise. Our main result is the following.

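Theorem 1 (Basic Result) Assume $(\mathcal{F}, Q)$ is regular. Then, for all $[a, b] \subset [0, h)$ with $a < b$, we
have

$$\frac{M_{n,[a,b]}}{n} \to \nu_{[a,b]} \equiv \int_{[a,b]} \frac{\lambda_0}{\lambda_0 - f}\, dQ(f) \qquad \text{and} \qquad \frac{M_{n,(a,h]}}{n} \to 2 - \nu_{[0,a]},$$

almost surely as $n \to +\infty$.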

A surprising behavior arises when (2) has no solution in $[h, +\infty)$, or equivalently when $I(h) < 1$.
Indeed, in such a case, it is easy to check that $\nu_{[0,h-\epsilon]} \le 1 + I(h) < 2$ for all $\epsilon > 0$ even though we
expect $\lim_{\epsilon \to 0} \nu_{[0,h-\epsilon]} = 2$ since for all $n$, $n^{-1} M_{n,[0,h]} = 2$ (i.e. each edge has two endpoints). In other
words, it appears that a constant fraction of edges is "missing" in the limit. The missing fraction
actually "escapes to $h$" which leads to what we call the innovation-pays-off phase as described above.
To get a better intuition for the existence of a solution to (2), consider the example $Q \sim \mathrm{Beta}(\alpha, \beta)$.
In Example 3 of Appendix C.3 we show there is a solution if and only if $\beta < \alpha + 1$. For a fixed $\alpha$, a
large $\beta$ indicates a "fast decay" to 0 at 1 while a small $\beta$ leads to a "fatter tail" around 1. A solution
to (2) exists in the latter case, e.g. in the uniform case. In other words, the innovation-pays-off
regime requires a more "rarefied" high fitness population.
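
The threshold in the Beta example can be checked by a direct computation. The sketch below assumes that $Q$ is the $\mathrm{Beta}(\alpha, \beta)$ distribution on $[0, 1]$ (so that $h = 1$), that $I(h)$ denotes the integral of $f/(h - f)$ against $Q$ as in the discussion of (2), and that $B(\cdot, \cdot)$ is the Beta function; it is not the argument given in the appendix.

```latex
I(1) = \int_0^1 \frac{f}{1-f}\,\frac{f^{\alpha-1}(1-f)^{\beta-1}}{B(\alpha,\beta)}\,df
     = \frac{B(\alpha+1,\,\beta-1)}{B(\alpha,\beta)}
     = \frac{\Gamma(\alpha+1)\,\Gamma(\beta-1)}{\Gamma(\alpha)\,\Gamma(\beta)}
     = \frac{\alpha}{\beta-1}, \qquad \beta > 1 .
```

For $\beta \le 1$ the integral diverges. Hence $I(1) < 1$, the condition for the innovation-pays-off phase, holds exactly when $\beta > \alpha + 1$, while at the boundary $\beta = \alpha + 1$ this computation gives $I(1) = 1$.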

Dynamics of the Innovation-Pays-Off Phase. In order to understand (informally) the dynamics
of the innovation-pays-off phase, fix a time $t^*$ and let $f^*$ be the largest fitness among vertices
present in the network at time $t^*$. Note that

• at time $t^*$, the cumulative fraction of the links shared by vertices of fitness up to $f^*$ is 2, since
every edge is accounted for twice;

• now, consider the network in the limit $t = +\infty$; by Theorem 1 and the discussion above, the
fraction of links shared among vertices of fitness up to $f^*$ is at most $1 + I(h)$; therefore at least
a fraction $1 - I(h)$ of links is shared among vertices of fitness larger than $f^*$, vertices which,
by definition, were not present at time $t^*$.

This is the "signature" of the innovation-pays-off phase: a constant fraction of the links changes 
hands toward higher and higher fitness values. 

Power Laws and Vertex Dynamics. In fact, we can prove more than Theorem 1. As stated
below in Theorems 3 and 4 and their counterparts in the continuous case, we exhibit power laws for
the degree distributions on the nodes of a given fitness and we get a tail exponent of $\lambda_0 f^{-1}$ where
$f$ is the given fitness. See Section 4. Also, as discussed above, we can prove vertex dynamics of the
form (1). Such a result is proved by considering a continuous-time embedding of the process as in [14].
Details are omitted. The constant $c$ in (1) is in fact $\lambda_0^{-1}$.

Proof Sketch. As we mentioned before, the basic idea of the proof of Theorem 1 (as well as of
the power law results in Theorems 3 and 4 below) is to couple the preferential attachment process
with Polya urn models. The first step is the analysis of the case $\mathcal{F}$ finite. There we proceed by
truncating large degrees and associating a color of a specially designed Polya process to each pair
(degree, fitness). The limit theory of Polya processes then reduces the problem to an eigenvector
computation of an appropriately defined matrix (see Section 2). This computation appears to be
tricky but turns out to be manageable, as described in Appendix A.

The countable and continuous cases are significantly more challenging since Polya urns with
infinitely many colors, whether countable or uncountable, are poorly understood. Instead, we use further
truncation and approximation techniques to couple the infinite cases with finite cases. In Section 4
we illustrate this idea on the somewhat easier special case of $\mathcal{F} = \{f_j\}_{j \ge 1}$ increasing. There we need
two finite Polya models, a lower bound and an upper bound, which are obtained by truncating $\mathcal{F}$
and mapping the remaining fitness values to either 0 or $h$. The general discrete case as well as the
continuous case require a much more sophisticated approach which is detailed in Appendix C.

Organization of the Paper. We start with a brief overview of generalized Polya urn models
in Section 2, followed by our treatment of preferential attachment for finite fitness distributions in
Section 3. The main steps of the general proof are illustrated in Section 4 in the special case where
$\mathcal{F} = \{f_j\}_{j \ge 1}$ is countable and increasing. Most proofs are relegated to the appendix. Most notably,
for lack of space the particularly interesting analysis of the continuous case is completely relegated
to Appendix C.

Notation. We denote by the unit vector along the i-th axis (usually the dimension is clear). 
The notation Is denotes the indicator of the event S. 

2 Generalized Polya Urns 

Our results are obtained through an appropriate mapping of the preferential fitness process to a 
finite generalized Polya urn scheme. We introduce here the basic limit theory of generalized Polya 
urn models keeping our notation consistent with the presentation of Janson [14], with the exception
of our matrix A which is the transpose of Janson's, in accordance with common practice in the Polya 
urn literature. 

Definition of the Polya Urn Process. We have $q < +\infty$ bins (corresponding to the colors in
the original Polya model described in the Introduction). Each bin $i \le q$ is assigned a fixed activity
$a_i$, $0 \le a_i < +\infty$. For $n \ge 0$, let

$$X_n = (X_{n,1}, \ldots, X_{n,q}),$$

where $X_{n,i}$ is the number of balls in bin $i$ at time $n$. The initial load is given by $X_0$, which may be
random or deterministic. Each bin, say $i$, also has a random vector $\xi_i = (\xi_{i,1}, \ldots, \xi_{i,q})$ with integer
coordinates. The process is defined as follows. At time $n$, we pick one bin. Bin $i$ is chosen with
probability proportional to $a_i X_{n-1,i}$. If bin $i$ is picked, we draw an independent copy $\xi_i^{(n)}$ of $\xi_i$ and
update $\{X_n\}_{n \ge 0}$ according to

$$X_n = X_{n-1} + \xi_i^{(n)}.$$
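
The urn dynamics just described can be sketched in a few lines. This is an illustrative sketch only: the names `run_urn` and `draw_update` are hypothetical, and the caller is assumed to supply update vectors that keep all loads nonnegative and the total weight positive.

```python
import random

def run_urn(x0, activities, draw_update, n_steps, seed=0):
    """Run a generalized Polya urn and return the final load vector.

    draw_update(i, rng) must return one sample of the update vector of bin i
    (a hypothetical callback supplied by the user).
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(n_steps):
        # Bin i is chosen with probability proportional to activity * load.
        weights = [a * xi for a, xi in zip(activities, x)]
        i = rng.choices(range(len(x)), weights=weights)[0]
        update = draw_update(i, rng)
        x = [xj + dj for xj, dj in zip(x, update)]
    return x

# Classical two-color Polya urn: unit activities, the drawn color gains one ball.
def classic_update(i, rng):
    return [1 if j == i else 0 for j in range(2)]

print(run_urn([1, 1], [1.0, 1.0], classic_update, 50))
```

The classical scheme from the Introduction is recovered with two bins, unit activities and the deterministic update $e_i$, as in the usage example.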

Basic Polya Urn Result. The limiting behavior of the Polya urn process described above can
be characterized in terms of the $q \times q$ matrix $A$ with entries

$$A_{i,j} = a_i\, E[\xi_{i,j}],$$

assuming conditions (A1)-(A6) in [14] are satisfied. In fact, we will only need to use the more general
assumption described in Remark 4.2 of [14]. Roughly speaking, we require that:

• The urn process is well-defined (see the definition of tenable in Remark 4.2 of [14]). Essentially,
we require that the number of balls remains nonnegative at all times with probability 1.

• The matrix $A$ satisfies a slight generalization of irreducibility and the initial load is positive on
a "dominating type." This generalization allows for dummy bins that "count certain events."
(See Section 3 "Limits for urns" of [14].)

• The vectors $\xi_i$ have finite second moments. In our application, the $\xi_i$'s will actually be bounded.

We refer the reader to [14] for more details. Under these conditions, it is not hard to see that $A$
has a unique largest positive eigenvalue $\lambda_1$ with corresponding positive left eigenvector $v_1$ and right
eigenvector $u_1$ (apply the Perron-Frobenius theorem to $A + \alpha I$ for an appropriate $\alpha$). We choose
$u_1, v_1$ to satisfy $a \cdot v_1 = 1$ and $u_1 \cdot v_1 = 1$ where $a$ is the vector of activities. The following theorem
characterizes the vector $X_n$.
Theorem 2 (Limit of Finite Urns [1]; Theorem 3.21 in [14]) Assume conditions (A1)-(A6)
of [14] are satisfied. Conditioned on essential non-extinction (see [14]) we have

$$\frac{X_n}{n} \to \lambda_1 v_1,$$

almost surely as $n \to +\infty$.

In our applications of Theorem 2, it will be easy to establish that "essential extinction" is not possible.
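
The limit in Theorem 2 can be approximated numerically once the matrix $A$ is known. The sketch below uses plain power iteration on the row vector, with an optional diagonal shift in the spirit of the $A + \alpha I$ remark above; the function name `perron_left` is hypothetical. The example matrix is an assumption: it is what one obtains for a two-bin urn with unit activities and deterministic update vectors $(0, 2)$ and $(1, 1)$.

```python
def perron_left(A, activities, shift=0.0, iters=500):
    """Approximate the top eigenvalue of A and its left eigenvector.

    The eigenvector is rescaled so that activities . v = 1. The shift should
    make A + shift * I entrywise nonnegative so that power iteration applies.
    """
    q = len(A)
    v = [1.0 / q] * q
    lam = 0.0
    for _ in range(iters):
        # One power-iteration step on the row vector: w = v (A + shift * I).
        w = [sum(v[i] * A[i][j] for i in range(q)) + shift * v[j] for j in range(q)]
        total = sum(w)
        lam = total - shift  # valid because sum(v) == 1 before the step
        v = [x / total for x in w]
    scale = sum(a * x for a, x in zip(activities, v))
    return lam, [x / scale for x in v]

A = [[0.0, 2.0], [1.0, 1.0]]  # assumed example: A[i][j] = a_i * E[xi_{i,j}]
lam1, v1 = perron_left(A, [1.0, 1.0], shift=1.0)
print(lam1, [lam1 * x for x in v1])
```

For this example the eigenvalues of $A$ are $2$ and $-1$, so $\lambda_1 = 2$ and $v_1 = (1/3, 2/3)$, giving the limit $\lambda_1 v_1 = (2/3, 4/3)$.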



3 Preferential Attachment: Finite Distributions 

In this section, we treat the case $\mathcal{F} = \{f_j\}_{j=1}^{J}$ where $J$ is finite, which we sometimes refer to as
the finite-type case. This will form the basic step in the analysis of the countable and continuous
cases. Without loss of generality, we take $\{f_j\}_{j=1}^{J}$ increasing. We analyze separately the distribution
of degrees within each fitness value (Section 3.1) and the distribution of links across fitness values
(Section 3.2). We then combine the two results in Section 3.3. Note that, as we describe below, only
the first-mover-advantage and fit-get-richer behaviors arise in the finite-type case.

3.1 Flat Fitness Distributions: First-Mover-Advantage

Suppose first that $J = 1$. This is the standard preferential attachment model, which is well understood
(see e.g. [20] and references therein). We rederive the degree distribution by first mapping to
a Polya urn process and then applying Theorem 2. The mapping is illustrative of our technique. Let
$L_{n,k}$ be the number of vertices of degree $k$ at time $n$; set $\mu_1 = \frac{2}{3}$ and, for $k \ge 2$,

$$\mu_k = \frac{2}{3} \prod_{j=2}^{k} \frac{j-1}{j+2} = \frac{4}{k(k+1)(k+2)}.$$

In particular, $\{\mu_k\}_{k \ge 1}$ is a power law with tail exponent 2.
Proposition 1 (1-Fitness Case; see e.g. [20]) For all $k \ge 1$,

$$\frac{L_{n,k}}{n} \to \mu_k,$$

almost surely as $n \to +\infty$.



Proof: Fix $k \ge 1$ and consider the following urn process with $k + 1$ urns of equal activities $a_i = 1$,
for all $1 \le i \le k + 1$. We will design the process in such a way that the number of balls in urn
$i$ at time $n$ represents the number of edges in the graph which are adjacent to vertices of degree
$i$, counting twice edges with both endpoints at vertices of degree $i$. The exception is the $(k + 1)$-st urn,
where the number of balls will represent the number of edges adjacent to vertices of degree $\ge k + 1$.

Let $X_0 = (0, 2, 0, \ldots, 0)$, reflecting the fact that initially there is a single vertex with a self loop
(degree 2). For $2 \le i \le k$, let the update vector be deterministic with

$$\xi_{i,j} = \begin{cases} 1, & j = 1 \\ -i, & j = i \\ i + 1, & j = i + 1 \\ 0, & \text{o.w.} \end{cases}$$

reflecting the fact that, if the new vertex being added to the graph links to an old vertex of degree
$i$, then the degree of that vertex becomes $i + 1$, therefore the edges adjacent to that vertex must be
accounted for in the urn $i + 1$ instead of the urn $i$. Finally, for urns $i = 1$ and $i = k + 1$, the following
update vectors respect the boundary conditions

$$\xi_{1,j} = \begin{cases} 2, & j = 2 \\ 0, & \text{o.w.} \end{cases} \qquad \text{and} \qquad \xi_{k+1,j} = \begin{cases} 1, & j \in \{1, k + 1\} \\ 0, & \text{o.w.} \end{cases}$$

It is not hard to see that the urn process described above can be coupled with the preferential
attachment process so that with probability 1 the following relations are satisfied, for all $n \ge 0$,

$$X_{n,\ell} = \ell\, L_{n,\ell}, \quad \text{for } 1 \le \ell \le k, \qquad X_{n,k+1} = \sum_{\ell \ge k+1} \ell\, L_{n,\ell}.$$

The proof is concluded by computing matrix $A$, its largest eigenvalue $\lambda_1$ and the corresponding left
eigenvector $v_1$ (see Appendix A). One can check that Conditions (A1)-(A6) of [14] are satisfied. ■
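
The constants $\mu_k$ of Proposition 1 can be sanity-checked with exact rational arithmetic. The sketch below assumes the product form $\mu_k = \frac{2}{3}\prod_{j=2}^{k}\frac{j-1}{j+2}$ and the closed form $4/(k(k+1)(k+2))$; the helper names are hypothetical. It also checks that the degree-weighted partial sums $\sum_{k \le K} k \mu_k$ equal $2 - 4/(K+2)$, consistent with each edge contributing two endpoints in the limit.

```python
from fractions import Fraction

def mu_product(k):
    """mu_k from the product form, computed exactly."""
    mu = Fraction(2, 3)
    for j in range(2, k + 1):
        mu *= Fraction(j - 1, j + 2)
    return mu

def mu_closed(k):
    """mu_k from the closed form 4 / (k (k + 1) (k + 2))."""
    return Fraction(4, k * (k + 1) * (k + 2))

K = 50
assert all(mu_product(k) == mu_closed(k) for k in range(1, K + 1))
partial = sum(k * mu_closed(k) for k in range(1, K + 1))
print(partial, 2 - Fraction(4, K + 2))
```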

3.2 Competition for Links across Fitness Values 

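We now consider the case $J = |\mathcal{F}| > 1$ finite. We aim to compute the limiting behavior of the
random variables $M_{n,j}$, $1 \le j \le J$, corresponding to the number of edges with an endpoint of fitness
$f_j$ at time $n$, counting twice edges with two endpoints of fitness $f_j$, i.e. the total degree of vertices
of fitness $f_j$. Write $q_j$ for the probability that $Q$ assigns to $f_j$. Let $\lambda_0 > 0$ be the largest solution to the equation

$$\sum_{j=1}^{J} \frac{f_j q_j}{\lambda_0 - f_j} = 1, \tag{3}$$

where, by monotonicity, $\lambda_0 \in (\max_j \{f_j\}, +\infty)$. Also, for $1 \le j \le J$, set

$$\nu_j = \frac{\lambda_0 q_j}{\lambda_0 - f_j}, \tag{4}$$

and verify that

$$\sum_{j=1}^{J} \nu_j = \sum_{j=1}^{J} \frac{\lambda_0 q_j}{\lambda_0 - f_j} = \sum_{j=1}^{J} \left[ 1 + \frac{f_j}{\lambda_0 - f_j} \right] q_j = 2. \tag{5}$$

We characterize the distribution of links across fitness values in terms of the $\nu_j$'s.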

Proposition 2 (Fitness Alone) For all $1 \le j \le J$,

$$\frac{M_{n,j}}{n} \to \nu_j,$$

almost surely as $n \to +\infty$.

Proof: We define the following urn process with $J$ urns in which urn $i \le J$ has activity $a_i = f_i$. The
urn process will be designed so that the number of balls in urn $i$ corresponds to the number of edges
with an endpoint of fitness $f_i$. For $1 \le i \le J$, the update vector is given by $\xi_i = e_i + \Delta_i$, where
$\Delta_i = e_j$ with probability $q_j$, for all $1 \le j \le J$. In the context of the preferential attachment process,
this reflects the fact that, if the new vertex links to a vertex of fitness $f_i$, then the number of edges
with an endpoint of fitness $f_i$ increases by one, hence the term $e_i$; moreover, the new vertex picks a
random fitness according to $Q$, hence the term $\Delta_i$. It is easy to couple the defined urn process with
the preferential attachment one so that, with probability 1, $X_{n,j} = M_{n,j}$, for all $1 \le j \le J$ and all
$n \ge 0$, provided $X_0 = 2e_i$ with probability $q_i$. The proof is concluded by computing matrix $A$, its
largest eigenvalue $\lambda_1$ and the corresponding left eigenvector $v_1$ (see Appendix A). ■
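
For a concrete finite fitness distribution, the limits in Proposition 2 can be computed numerically. The sketch below assumes that $\lambda_0$ is the largest root of $\sum_j f_j q_j / (\lambda - f_j) = 1$ and that $\nu_j = \lambda_0 q_j / (\lambda_0 - f_j)$, with strictly positive fitnesses and probabilities; the function names are hypothetical and bisection is just one convenient root-finding choice.

```python
def solve_lambda0(fitnesses, probs, rel_tol=1e-12):
    """Largest root of sum_j f_j q_j / (lam - f_j) = 1, found by bisection."""
    def total(lam):
        return sum(f * q / (lam - f) for f, q in zip(fitnesses, probs))

    lo = max(fitnesses)        # total(lam) blows up as lam decreases to max f_j
    hi = lo + 1.0
    while total(hi) > 1.0:     # total is decreasing in lam beyond max f_j
        hi = lo + 2.0 * (hi - lo)
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if total(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def link_shares(fitnesses, probs):
    """Return lambda_0 and the limiting link shares nu_j."""
    lam0 = solve_lambda0(fitnesses, probs)
    nu = [lam0 * q / (lam0 - f) for f, q in zip(fitnesses, probs)]
    return lam0, nu

lam0, nu = link_shares([1.0, 2.0], [0.5, 0.5])
print(lam0, nu, sum(nu))
```

In this two-fitness example the equation reduces to $\lambda^2 - 4.5\lambda + 4 = 0$, whose larger root is about $3.28$, and the shares $\nu_j$ sum to 2 as expected since every edge has two endpoints.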
3.3 Finite Distributions: Fit-Get-Richer



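In this section, we derive the degree distribution of preferential attachment with fitness under finite
fitness distributions. For all $1 \le j \le J$ and $k \ge 1$, denote by $N_{n,(j,k)}$ the number of vertices of fitness
$f_j$ and degree $k$ at time $n$. Define $\lambda_0$ and $\nu_j$ as in Section 3.2. Moreover, for all $1 \le j \le J$ and
$k \ge 1$, set $\eta_{(j,k)}$ as follows

$$\eta_{(j,k)} = \nu_j\, \frac{\lambda_0 f_j^{-1} - 1}{\lambda_0 f_j^{-1} + 1} \prod_{i=2}^{k} \frac{i - 1}{i + \lambda_0 f_j^{-1}}.$$

In particular,

$$\frac{\eta_{(j,k+1)}}{\eta_{(j,k)}} = \frac{k}{k + 1 + \lambda_0 f_j^{-1}} = 1 - \frac{1 + \lambda_0 f_j^{-1}}{k} + O(k^{-2}),$$

as $k$ gets large. Thus, for fixed $j$, $\{\eta_{(j,k)}\}_{k \ge 1}$ has tail exponent $\lambda_0 f_j^{-1}$.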

Proposition 3 (Finite Fitness Distributions: Fit-Get-Richer) For all $1 \le j \le J$ and $k \ge 1$,
we have

$$\frac{N_{n,(j,k)}}{n} \to \eta_{(j,k)},$$

almost surely as $n \to +\infty$.

Observe that the tail exponent is a decreasing function of the fitness. Hence, the tail of the distribution
gets fatter as the fitness increases. This is the "signature" of the fit-get-richer phase. The proof
of Proposition 3 is postponed to the appendix. It follows from a combination of the couplings in
Propositions 1 and 2 by defining a Polya urn process with a bin for every pair of fitness and degree.
Once again, the degree is truncated at a maximum value and an extra bin accounts for all degrees
above.

4 Preferential Attachment: Countable Distributions 

If $J = +\infty$, which we sometimes call the infinite-type case, the coupling described in the previous
section cannot be used directly, since it would then require an infinite number of urns (for the
fitnesses alone) and Theorem 2 is not known to hold generally in the infinite case. Nevertheless, we
obtain similar results by coupling our process this time with two finite-type preferential attachment
processes which provide lower and upper bounds on the degree distribution of our process. The
coupling is presented in Section 4.1. Using this coupling and Proposition 3, we exhibit the following
evolution scenarios for the preferential attachment process with countable fitness distribution:

• the fit-get-richer scenario, taking place when $\sum_{j \ge 1} \frac{f_j q_j}{h - f_j} \ge 1$;

• the innovation-pays-off scenario, taking place when $\sum_{j \ge 1} \frac{f_j q_j}{h - f_j} < 1$,

where $h = \sup_{j \ge 1} \{f_j\}$.

For convenience, we treat only the case $\{f_j\}_{j \ge 1}$ increasing. The general case, which is omitted
from this extended abstract, follows from an analysis similar to that for continuous fitness distributions
in Appendix C.



4.1 Coupling

Denoting by $h$ the supremum of $\{f_j\}_{j \ge 1}$, let us assume that $h < +\infty$; the case $h = +\infty$ is treated in
Section B.4 of the appendix. Setting $l$ to be a positive integer, the upper $l$-truncation of $\mathcal{F}$, denoted
$\overline{\mathcal{F}} = \{\overline{f}_j\}_{j \ge 1}$, and the lower $l$-truncation of $\mathcal{F}$, denoted $\underline{\mathcal{F}} = \{\underline{f}_j\}_{j \ge 1}$, are defined by

$$\overline{f}_j = \begin{cases} f_j, & j \le l \\ 0, & \text{o.w.} \end{cases} \qquad \underline{f}_j = \begin{cases} f_j, & j \le l \\ h, & \text{o.w.} \end{cases}$$

We shall couple the $(\mathcal{F}, Q)$ chain with the chains $(\overline{\mathcal{F}}, Q)$, $(\underline{\mathcal{F}}, Q)$ defined by the upper and lower
truncations to provide upper and lower bounds respectively on the degrees of chain $(\mathcal{F}, Q)$.$^1$ Roughly
speaking, the chains can be coupled so that, at every step, the probability of choosing an old vertex
of fitness value $f_1$ up to $f_j$ is larger in the $(\overline{\mathcal{F}}, Q)$ than in the $(\mathcal{F}, Q)$ chain and larger in the $(\mathcal{F}, Q)$
than in the $(\underline{\mathcal{F}}, Q)$ chain. This property certainly holds in the beginning of the processes and then
reproduces itself since it makes the cumulative degree of fitness levels $f_1$ up to $f_j$ grow faster in the
$(\overline{\mathcal{F}}, Q)$ than in the $(\mathcal{F}, Q)$ chain and faster in the $(\mathcal{F}, Q)$ than in the $(\underline{\mathcal{F}}, Q)$ chain. It is important to
note however that the degree by itself is not sufficient to guarantee the domination of probabilities
for the next step of the process; rather we couple the edges which get added at each step in such a
way that the fitness values of the endpoints in chain $(\underline{\mathcal{F}}, Q)$ dominate the fitness values in $(\mathcal{F}, Q)$
and those dominate the fitness values in chain $(\overline{\mathcal{F}}, Q)$.
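
The effect of the two truncations on the limiting link shares can be illustrated numerically. The sketch below assumes that each truncated chain is a finite-type chain whose shares are $\lambda q_j / (\lambda - f_j)$, with $\lambda$ the largest root of $\sum_j f_j q_j / (\lambda - f_j) = 1$, and that the levels beyond $l$ are lumped into a single type of fitness 0 (upper truncation) or $h$ (lower truncation). The function names and the example sequence $f_j = 1 - 2^{-j}$, $q_j = 2^{-j}$ are illustrative choices, not taken from the paper.

```python
def solve_lambda0(types, rel_tol=1e-12):
    """types: list of (fitness, probability). Largest root of sum f q / (lam - f) = 1."""
    active = [(f, q) for f, q in types if f > 0.0 and q > 0.0]

    def total(lam):
        return sum(f * q / (lam - f) for f, q in active)

    lo = max(f for f, _ in active)
    hi = lo + 1.0
    while total(hi) > 1.0:
        hi = lo + 2.0 * (hi - lo)
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if total(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def truncated_bounds(fits, probs, h, l):
    """Limiting shares of levels 1..l in the lower and upper l-truncated chains."""
    head = list(zip(fits[:l], probs[:l]))
    rest = 1.0 - sum(probs[:l])                  # mass of all levels beyond l
    lam_up = solve_lambda0(head + [(0.0, rest)])  # remaining levels mapped to 0
    lam_low = solve_lambda0(head + [(h, rest)])   # remaining levels mapped to h
    nu_up = [lam_up * q / (lam_up - f) for f, q in head]
    nu_low = [lam_low * q / (lam_low - f) for f, q in head]
    return nu_low, nu_up

nu_low, nu_up = truncated_bounds([0.5, 0.75, 0.875], [0.5, 0.25, 0.125], 1.0, 3)
print(nu_low, nu_up)
```

In this example every level $j \le l$ receives a smaller share in the lower truncation than in the upper one, matching the direction of the bounds described above.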

Fitness Alone. We first bound $M_{n,j}$, defined as in Section 3.2 to be the number of edges with an
endpoint of fitness $f_j$ (counting twice edges with two endpoints of fitness $f_j$). Fixing $1 \le l < +\infty$,
let $\overline{M}_{n,j}$ and $\underline{M}_{n,j}$ be the corresponding variables of the $(\overline{\mathcal{F}}, Q)$, $(\underline{\mathcal{F}}, Q)$ chains. It is clear that the
latter are equivalent to finite type urn processes, so that Proposition 2 applies. Let $\overline{\nu}_j$ and $\underline{\nu}_j$ be the
(almost sure) limits of $n^{-1} \overline{M}_{n,j}$ and $n^{-1} \underline{M}_{n,j}$. Then we have the following.

Lemma 1 (Coupling: Fitness Alone) For all $1 \le j \le l$, it holds almost surely that

$$\limsup_{n \to +\infty} \frac{M_{n,j}}{n} \le \overline{\nu}_j, \qquad \text{and} \qquad \liminf_{n \to +\infty} \frac{M_{n,j}}{n} \ge \underline{\nu}_j.$$

Proof: Consider the $(\mathcal{F}, Q)$-chain. At step $n \ge 1$, a vertex is picked with probability proportional
to its degree scaled by its fitness. Let $F_n$ be the fitness of the chosen vertex and denote by $p_{n-1,i}$ the
probability that $F_n = f_i$ given the state of the chain after step $n - 1$. After a vertex is picked, a new
vertex is added with fitness chosen according to $Q$. Let $F'_n$ be the fitness of this new vertex. Denote
by $\overline{F}_n$, $\overline{F}'_n$, $\underline{F}_n$, $\underline{F}'_n$ the corresponding variables for the chains $(\overline{\mathcal{F}}, Q)$ and $(\underline{\mathcal{F}}, Q)$ respectively,
and likewise $\overline{p}_{n-1,i}$, $\underline{p}_{n-1,i}$ for the selection probabilities.
We define a coupling of the three chains so as to preserve the following conditions:

1. For all $n \ge 1$, $\overline{F}_n \le F_n \le \underline{F}_n$ and $\overline{F}'_n \le F'_n \le \underline{F}'_n$.

2. For all $n \ge 1$ and all $1 \le i \le l$, $\underline{M}_{n,i} \le M_{n,i} \le \overline{M}_{n,i}$.

3. For all $n \ge 1$ and all $1 \le i \le l$, $\underline{p}_{n,i} \le p_{n,i} \le \overline{p}_{n,i}$.

Note that 3. follows immediately from 1. and 2. We now justify why the conditions are satisfied
for all $n \ge 0$. The initial configuration ($n = 0$) is constructed by picking an $i$ according to $Q$ and
choosing the corresponding fitness in all three chains. Therefore the conditions are satisfied at time
0 by the definition of $\overline{\mathcal{F}}$ and $\underline{\mathcal{F}}$. Assuming that Conditions 1., 2., and 3. are satisfied at time $n - 1$
we will show that they are true at time $n$. Indeed, since the fitness of the new vertex is picked
according to $Q$ in all 3 chains it follows from the definition of $\overline{\mathcal{F}}$ and $\underline{\mathcal{F}}$ that $\overline{F}'_n \le F'_n \le \underline{F}'_n$. Now
let us consider the step of picking the old vertex. By 3., it follows that the choices made in the three
chains can be coupled so as to satisfy Conditions 1. and 2. Indeed, proceed as follows:

• with probability $\sum_{i=1}^{I} \underline{p}_{n-1,i}$, pick the same fitness in all three chains according to $\{\underline{p}_{n-1,i}\}_{i=1}^{I}$; 

• with probability $\sum_{i=1}^{I} (p_{n-1,i} - \underline{p}_{n-1,i})$, pick the same fitness in chains $(\mathcal{F}, Q)$ and $(\overline{\mathcal{F}}, Q)$ 
according to $\{(p_{n-1,i} - \underline{p}_{n-1,i})\}_{i=1}^{I}$, and pick fitness $h$ for $(\underline{\mathcal{F}}, Q)$; 

• with probability $\sum_{i=1}^{I} (\overline{p}_{n-1,i} - p_{n-1,i})$, pick a fitness for the $(\overline{\mathcal{F}}, Q)$-chain according to 
$\{(\overline{p}_{n-1,i} - p_{n-1,i})\}_{i=1}^{I}$, pick fitness $h$ for $(\underline{\mathcal{F}}, Q)$, and pick a fitness for $(\mathcal{F}, Q)$ according to $\{f_j M_{n-1,j}\}_{j > I}$; 

• note that there is no remaining probability mass since $\sum_{i=1}^{I} \overline{p}_{n-1,i} = 1$. 

It should be clear that the described coupling is valid. This concludes the proof. ■ 

^Strictly speaking, we think of $Q$ here as a distribution on the indices of the fitness sequences $\mathcal{F}$, $\overline{\mathcal{F}}$ and $\underline{\mathcal{F}}$ rather 
than on the fitnesses themselves. 
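The three-way choice of the old vertex can be sketched in a few lines of Python. This is only an illustration of the bullet scheme above, assuming the three probability vectors are ordered componentwise and that the largest one sums to 1; the function name `coupled_pick` and the use of `None` for "a type above the truncation level" are conventions introduced here, not part of the paper.

```python
def coupled_pick(p_low, p_mid, p_up, u):
    """Map one uniform draw u in [0, 1) to a coupled triple of picks.

    Returns (i_up, i_mid, i_low). An index refers to a fitness type below
    the truncation level. None for the middle (original) chain stands for
    "some type above the truncation level"; None for the lower chain stands
    for "the top fitness h".
    Assumes p_low[i] <= p_mid[i] <= p_up[i] for all i and sum(p_up) == 1.
    """
    segments = []
    # First bullet: mass shared by all three chains.
    for i, p in enumerate(p_low):
        segments.append((p, (i, i, i)))
    # Second bullet: mass shared by the original and overlined chains only.
    for i in range(len(p_mid)):
        segments.append((p_mid[i] - p_low[i], (i, i, None)))
    # Third bullet: mass left to the overlined chain alone.
    for i in range(len(p_up)):
        segments.append((p_up[i] - p_mid[i], (i, None, None)))
    acc = 0.0
    for length, outcome in segments:
        acc += length
        if u < acc:
            return outcome
    return segments[-1][1]  # guard against floating-point round-off


# Hypothetical probabilities for two types below the truncation level.
example = coupled_pick([0.25, 0.25], [0.25, 0.5], [0.5, 0.5], 0.6)
```

Each marginal is correct because the lengths of the segments carrying a given index in a given chain add up to that chain's probability for the index.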

Full Analysis. Using our coupling idea we can also derive bounds on $N_{n,(j,k)}$, defined as in Section 3.3 
to be the number of vertices of fitness $f_j$ and degree $k$ at time $n$ in the $(\mathcal{F}, Q)$-chain, in terms 
of the corresponding variables of the $(\overline{\mathcal{F}}, Q)$-chain and $(\underline{\mathcal{F}}, Q)$-chain. The coupling has a similar flavor 
and its details are postponed to Section B.1 of the appendix. 



4.2 Fit-Get-Richer Phase 

Let $h = \sup_{j \ge 1} f_j < +\infty$, the case $h = +\infty$ being treated in Section B.4. Unlike the finite-type case, 
when $J = +\infty$, we are not guaranteed that there exists a solution of 

$$\sum_{j=1}^{J} \frac{f_j q_j}{\lambda_0 - f_j} = 1, \qquad (6)$$

with $\lambda_0 > h$. Observe, however, that in our proof of Proposition [2] this was necessary for the existence 
of a (summable) Perron-Frobenius eigenvector (see the expression for $v_1$ in the proof of Proposition [2]). 
We will actually show that the behavior of the process depends crucially on the existence of such a 
solution. In this section, we consider the case 

$$\sum_{j=1}^{J} \frac{f_j q_j}{h - f_j} > 1. \qquad (7)$$

We generalize Proposition [3] exhibiting a fit-get-richer behavior in this case. The following theorem 
summarizes our result. 

Theorem 3 (Discrete Case: Fit-Get-Richer Phase) Let $1 \le J \le +\infty$, $h = \sup_{j \ge 1} f_j < +\infty$. 
Assume 

$$\sum_{j=1}^{J} \frac{f_j q_j}{h - f_j} > 1.$$

Then it holds that 

1. for all $1 \le j < J + 1$, $\frac{M_{n,j}}{n} \to \nu_j$, almost surely as $n \to +\infty$, 

2. for all $1 \le j < J + 1$ and $k \ge 1$, $\frac{N_{n,(j,k)}}{n} \to \eta_{(j,k)}$, almost surely as $n \to +\infty$, 

where $\{\nu_j\}_j$ and $\{\eta_{(j,k)}\}_{j,k}$ are defined as in the finite case (Sections 3.2 and 3.3). 
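For a finite fitness set, the limits in part 1 can be computed numerically. The sketch below assumes the defining equation $\sum_j f_j q_j / (\lambda_0 - f_j) = 1$ with $\lambda_0 > h$ and the limit $\nu_j = \lambda_0 q_j / (\lambda_0 - f_j)$, as used in the proof of Proposition [2]; the function names and the two-type example are illustrative choices, not taken from the paper.

```python
def solve_lambda0(f, q, tol=1e-12):
    """Bisection for the root lambda_0 > max(f) of sum_j f_j q_j/(lambda - f_j) = 1.

    Assumes a finite fitness list f with matching probabilities q, and that
    the largest fitness has positive probability (so the sum blows up as
    lambda decreases to max(f)).
    """
    h = max(f)
    mean = sum(fj * qj for fj, qj in zip(f, q))

    def S(lam):
        return sum(fj * qj / (lam - fj) for fj, qj in zip(f, q))

    # S is decreasing on (h, +inf) and S(h + mean) <= mean / mean = 1.
    lo, hi = h, h + mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if S(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def limiting_shares(f, q):
    """Return (lambda_0, [nu_j]) with nu_j = lambda_0 q_j / (lambda_0 - f_j)."""
    lam0 = solve_lambda0(f, q)
    return lam0, [lam0 * qj / (lam0 - fj) for fj, qj in zip(f, q)]


# Hypothetical example: two equally likely fitness values 1 and 2.
lam0, nu = limiting_shares([1.0, 2.0], [0.5, 0.5])
```

Since every edge has two endpoints, the shares `nu` should add up to 2, which gives a quick sanity check on the solver.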
4.3 Innovation-Pays-Off Phase 

Assume that $h = \sup_{j \ge 1} f_j < +\infty$ and that 

$$\sum_{j=1}^{J} \frac{f_j q_j}{h - f_j} \le 1. \qquad (8)$$

It is easy to check that this is possible only if the fitness supremum $h$ is not attained in $\mathcal{F}$ (see also 
the discussion in Example [1] of the appendix). In particular, it must be that $J = +\infty$. Now set 
$\nu'_j = \frac{h q_j}{h - f_j}$ for $1 \le j < +\infty$, and note in particular that 

$$\sum_{j=1}^{+\infty} \nu'_j = \sum_{j=1}^{+\infty} q_j + \sum_{j=1}^{+\infty} \frac{f_j q_j}{h - f_j} = 1 + \sum_{j=1}^{+\infty} \frac{f_j q_j}{h - f_j} \le 2, \qquad (9)$$

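with equality only if there is equality in (8). Also, for all $1 \le j < +\infty$ and $k \ge 1$, let $\eta'_{(j,k)}$ be 
defined as 

$$\eta'_{(j,k)} = \frac{h q_j}{k f_j + h} \prod_{l=1}^{k-1} \frac{l f_j}{l f_j + h}.$$

In particular, 

$$\eta'_{(j,k)} = \frac{h q_j}{f_j} \, \frac{\Gamma(k)\,\Gamma(1 + h f_j^{-1})}{\Gamma(k + 1 + h f_j^{-1})} = \frac{h q_j}{f_j} \, \Gamma(1 + h f_j^{-1}) \, k^{-(1 + h f_j^{-1})} (1 + o(1)),$$

as $k$ gets large. Hence, for fixed $j$, $\{\eta'_{(j,k)}\}_{k \ge 1}$ has tail exponent $h f_j^{-1}$. 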

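Theorem 4 (Discrete Case: Innovation-Pays-Off Phase) Let $h = \sup_{j \ge 1} f_j < +\infty$. Assume 

$$\sum_{j=1}^{+\infty} \frac{f_j q_j}{h - f_j} \le 1. \qquad (10)$$

Then it holds that 

1. For all $1 \le j < +\infty$, $\frac{M_{n,j}}{n} \to \nu'_j$, almost surely as $n \to +\infty$. 

2. For all $1 \le j < +\infty$ and $k \ge 1$, $\frac{N_{n,(j,k)}}{n} \to \eta'_{(j,k)}$, almost surely as $n \to +\infty$. 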

5 Open Problems 

A challenging open problem is to give an exact quantitative description of the dynamics of the 
innovation-pays-off phase. Our results imply that a constant fraction of the links "escapes at infinity." 
But we know little about the transient behavior in this regime. How are the links distributed among 
the highest fitnesses present in the system at any given time? At what rate are new nodes with 
higher fitnesses taking over? How does the transient behavior depend on the fitness distribution? 
This could have important practical implications. 
A Analysis of Bounded Discrete Fitness Distributions 



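Proof of Proposition [1]. We complete the proof of Proposition [1] by computing the largest 
positive eigenvalue $\lambda_1$ and the corresponding left eigenvector $v_1$ of the matrix $A$. Because the $\xi_i$'s are 
deterministic, it follows that $A_{ij} = \xi_{ij}$ for all $1 \le i, j \le q$. To compute $\lambda_1$ we first compute the 
corresponding right eigenvector. Note that 

$$\sum_{j=1}^{q} A_{ij} = 2$$

for all $1 \le i \le q$ and therefore $u_1$ is $(1, \ldots, 1)$ (up to a constant factor) and $\lambda_1 = 2$. The left 
eigenvector $v_1$ must satisfy 

$$\sum_{i=1}^{q} (v_1)_i = 1$$

by convention, as well as 

$$\sum_{i=2}^{q} (v_1)_i = 2 (v_1)_1,$$

which with the previous equation implies $(v_1)_1 = 1/3$. Also, for $2 \le l \le q - 1$, 

$$l (v_1)_{l-1} - l (v_1)_l = 2 (v_1)_l,$$

or, 

$$\frac{(v_1)_l}{(v_1)_{l-1}} = \frac{1}{1 + \frac{2}{l}}.$$

Therefore, 

$$(v_1)_k = \frac{2}{(k+1)(k+2)}.$$

Finally, by Theorem [2] we get 

$$\frac{N_{n,k}}{n} = \frac{X_{n,k}}{kn} \to \frac{2 (v_1)_k}{k} = \frac{4}{k(k+1)(k+2)},$$

almost surely as $n \to +\infty$. ■ 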
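The eigenvector recursion in this proof can be checked with exact arithmetic. The sketch below assumes the recursion reads $(v_1)_1 = 1/3$ and $(v_1)_l = (v_1)_{l-1} / (1 + 2/l)$, and that the limiting fraction of vertices of degree $k$ is $2 (v_1)_k / k$; the function names are illustrative.

```python
from fractions import Fraction


def left_eigenvector_entries(k_max):
    """Entries (v_1)_1, ..., (v_1)_{k_max} from the assumed recursion."""
    v = [Fraction(1, 3)]                        # (v_1)_1 = 1/3
    for l in range(2, k_max + 1):
        v.append(v[-1] / (1 + Fraction(2, l)))  # (v_1)_l = (v_1)_{l-1} / (1 + 2/l)
    return v


def degree_fractions(k_max):
    """Limiting fraction of vertices of degree k, computed as 2 (v_1)_k / k."""
    v = left_eigenvector_entries(k_max)
    return [2 * v[k - 1] / k for k in range(1, k_max + 1)]


fractions_by_degree = degree_fractions(50)
```

Under these assumptions the output coincides with the classical preferential attachment law $4 / (k(k+1)(k+2))$ for every degree checked.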

Proof of Proposition [2]. We complete the proof of Proposition [2] by computing the largest positive 
eigenvalue $\lambda_1$ and the corresponding left eigenvector $v_1$ of the matrix $A$ which has the following form 

$$A_{ij} = f_i (q_j + 1_{\{i = j\}}).$$

We compute the corresponding $\lambda_1, v_1$. For all $1 \le j \le J$, $v_1$ must satisfy 

$$q_j \sum_{i=1}^{J} f_i (v_1)_i + f_j (v_1)_j = \lambda_1 (v_1)_j. \qquad (11)$$

By the convention 

$$a \cdot v_1 = 1 \iff \sum_{i=1}^{J} f_i (v_1)_i = 1, \qquad (12)$$

it follows that, for all $1 \le j \le J$, 

$$(v_1)_j = \frac{q_j}{\lambda_1 - f_j}.$$

Plugging back into (12), we get 

$$\sum_{j=1}^{J} \frac{f_j q_j}{\lambda_1 - f_j} = 1.$$

Therefore, $\lambda_1 = \lambda_0$ and $(v_1)_j = (\lambda_1)^{-1} \nu_j$ for all $1 \le j \le J$. The result follows by Theorem [2]. ■ 
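A small simulation can illustrate the limit established here. The sketch below assumes the fitness-alone dynamics described in the text: the process starts from one vertex with a self-loop, an old type is picked with probability proportional to $f_j M_{n,j}$, and the new vertex draws its type from $Q$. The function name, the seed, and the two-type example are illustrative assumptions.

```python
import random


def simulate_fitness_urn(f, q, n_steps, seed=0):
    """Simulate the type-level urn and return [M_{n,j} / n for each type j]."""
    rng = random.Random(seed)
    types = list(range(len(f)))
    M = [0] * len(f)
    # Initial vertex with a self-loop: two edge endpoints on one type.
    M[rng.choices(types, weights=q)[0]] = 2
    for _ in range(n_steps):
        weights = [f[j] * M[j] for j in types]
        old = rng.choices(types, weights=weights)[0]  # endpoint at an old vertex
        new = rng.choices(types, weights=q)[0]        # type of the new vertex
        M[old] += 1
        M[new] += 1
    return [m / n_steps for m in M]


# Hypothetical example: fitness values 1 and 2, each with probability 1/2.
shares = simulate_fitness_urn([1.0, 2.0], [0.5, 0.5], n_steps=20_000, seed=1)
```

For this example the equation $\sum_j f_j q_j / (\lambda_0 - f_j) = 1$ reduces to $\lambda_0^2 - 4.5 \lambda_0 + 4 = 0$, so $\lambda_0 = (4.5 + \sqrt{4.25})/2$, and the simulated shares should lie close to $\nu_j = \lambda_0 q_j / (\lambda_0 - f_j)$.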

Proof of Proposition [5]. Fix $1 \le j \le J$ and $k \ge 1$. Set $r = k + 1$ and $q = rJ$. Consider the 
following urn process which is a combination of those in Propositions [1] and [2]. We now have a 
bin, indexed $(i, l)$, for each fitness $f_i$ and each degree $l$ up to $k$. The number of balls in bin $(i, l)$ 
at time $n$ is denoted $X_{n,(i,l)}$. The urn process is defined so that $X_{n,(i,l)} = l N_{n,(i,l)}$ (see below). Also, 
for each $i$, the bin $(i, r)$ counts all the links attached to a vertex of fitness $f_i$ and degree more than 
$k$, that is we have 

$$X_{n,(i,r)} = \sum_{l \ge k+1} l N_{n,(i,l)}.$$

The activity of bin $(i, l)$ is $a_{(i,l)} = f_i$. Say at step $n$ we pick a ball from bin $(i, l)$ with $1 < l < r$. 
Then, 

1. we choose a fitness, say $i'$, according to $Q$; 

2. we add one ball to bin $(i', 1)$; 

3. we remove $l$ balls from bin $(i, l)$; 

4. we add $l + 1$ balls to bin $(i, l + 1)$. 

The cases $l = 1, r$ are handled similarly (see Proposition [1]). 

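We compute matrix $A$. Let $(i, l)$ be such that $1 < l < r$. Then row $(i, l)$ of $A$ is 

$$A_{(i,l),(i',l')} = \begin{cases} -f_i l, & i' = i,\ l' = l, \\ f_i (l + 1), & i' = i,\ l' = l + 1, \\ f_i q_{i'}, & l' = 1, \\ 0, & \text{o.w.} \end{cases}$$

For $l = 1$, we get 

$$A_{(i,1),(i',l')} = \begin{cases} f_i (-1 + q_i), & i' = i,\ l' = 1, \\ 2 f_i, & i' = i,\ l' = 2, \\ f_i q_{i'}, & i' \ne i,\ l' = 1, \\ 0, & \text{o.w.} \end{cases}$$

and, for $l = r$, 

$$A_{(i,r),(i',l')} = \begin{cases} f_i, & i' = i,\ l' = r, \\ f_i q_{i'}, & l' = 1, \\ 0, & \text{o.w.} \end{cases}$$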

We compute the corresponding $\lambda_1, u_1, v_1$. Consider the following guess for $u_1$ 

$$(u_1)_{(i,l)} = \frac{f_i}{\lambda_0 - f_i},$$

for all $1 \le i \le J$ and $1 \le l \le r$ where $\lambda_0$ is defined in ([3]). Then we have for $1 \le i \le J$ and $1 \le l \le r$, 

$$(A u_1)_{(i,l)} = f_i + \frac{f_i^2}{\lambda_0 - f_i} = \frac{f_i (\lambda_0 - f_i + f_i)}{\lambda_0 - f_i} = \lambda_0 (u_1)_{(i,l)},$$

where we used ([3]). Hence, the Perron-Frobenius eigenvalue is $\lambda_1 = \lambda_0$ and the corresponding right 
eigenvector is $u_1$ as above. 

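It remains to compute $v_1$. Define the auxiliary vector 

$$(\tilde{v}_1)_i = \sum_{l=1}^{r} (v_1)_{(i,l)}$$

for $1 \le i \le J$. Then, by looking at column $(i, 1)$ of $A$ we must have 

$$q_i \sum_{i'=1}^{J} f_{i'} (\tilde{v}_1)_{i'} - f_i (v_1)_{(i,1)} = \lambda_1 (v_1)_{(i,1)}, \qquad (13)$$

for all $1 \le i \le J$. From column $(i, r)$ we get 

$$f_i \left( r (v_1)_{(i,r-1)} + (v_1)_{(i,r)} \right) = \lambda_1 (v_1)_{(i,r)}. \qquad (14)$$

Finally, for $1 < l < r$, column $(i, l)$ gives 

$$f_i \left( l (v_1)_{(i,l-1)} - l (v_1)_{(i,l)} \right) = \lambda_1 (v_1)_{(i,l)}. \qquad (15)$$

Summing (13), (14), and (15), we obtain 

$$q_i \sum_{i'=1}^{J} f_{i'} (\tilde{v}_1)_{i'} + f_i (\tilde{v}_1)_i = \lambda_1 (\tilde{v}_1)_i.$$

This is identical to (11) from Proposition [2] and therefore 

$$(\tilde{v}_1)_i = \frac{q_i}{\lambda_1 - f_i}$$

for all $1 \le i \le J$. Also, from (15), for $1 < l < r$, we get 

$$(v_1)_{(i,l)} = \frac{l f_i}{l f_i + \lambda_1} (v_1)_{(i,l-1)}.$$

By our convention, 

$$\sum_{i'=1}^{J} f_{i'} (\tilde{v}_1)_{i'} = 1,$$

we get from (13) 

$$(v_1)_{(i,1)} = \frac{q_i}{f_i + \lambda_1}.$$

From Theorem [2] we derive 

$$\frac{N_{n,(j,k)}}{n} = \frac{X_{n,(j,k)}}{kn} \to \frac{\lambda_1 (v_1)_{(j,k)}}{k} = \eta_{(j,k)},$$

almost surely as $n \to +\infty$. This concludes the proof. ■ 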
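The resulting degree law for a fixed fitness can be tabulated directly. The sketch below assumes the recursion implied by the eigenvector computation, namely $\eta_{(j,1)} = q_j \lambda_0 / (\lambda_0 + f_j)$ and $\eta_{(j,k)} = \eta_{(j,k-1)} (k-1) f_j / (k f_j + \lambda_0)$; the function name and the numerical example are illustrative.

```python
def eta(lam0, f_j, q_j, k_max):
    """Return [eta_(j,1), ..., eta_(j,k_max)] from the assumed recursion."""
    out = [q_j * lam0 / (lam0 + f_j)]
    for k in range(2, k_max + 1):
        out.append(out[-1] * (k - 1) * f_j / (k * f_j + lam0))
    return out


# Hypothetical example: fitness values 1 and 2 with probability 1/2 each,
# for which lambda_0 solves lambda^2 - 4.5 lambda + 4 = 0.
lam0_example = (4.5 + 4.25 ** 0.5) / 2
eta_fit = eta(lam0_example, f_j=2.0, q_j=0.5, k_max=20_000)
```

Since every vertex of fitness $f_j$ has some degree, the entries should sum to $q_j$ up to a small truncation error, which is a useful consistency check on the recursion.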
B Analysis of Countable Discrete Fitness Distributions 



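B.1 Coupling 

We derive bounds on $N_{n,(j,k)}$, defined to be the number of vertices of fitness $f_j$ and degree $k$ at time 
$n$ in the $(\mathcal{F}, Q)$-chain of Section 4. Fix $I$ and let $\overline{N}_{n,(j,k)}$, $\underline{N}_{n,(j,k)}$ be the corresponding variables for 
the chains $(\overline{\mathcal{F}}, Q)$ and $(\underline{\mathcal{F}}, Q)$ of Section 4.1 defined by the $I$-truncations of $\mathcal{F}$. Since the latter have 
finite fitness distributions, we can apply Proposition [3]. Let $\overline{\eta}_{(j,k)}$ and $\underline{\eta}_{(j,k)}$ be the almost sure limits 
of $n^{-1} \overline{N}_{n,(j,k)}$ and $n^{-1} \underline{N}_{n,(j,k)}$. For the full coupling, we also need the degree tails for a fixed fitness. 
Let 

$$T_{n,(j,k)} = \sum_{k' \ge k} k' N_{n,(j,k')},$$

and similarly for $\underline{T}_{n,(j,k)}$ and $\overline{T}_{n,(j,k)}$. Also, let 

$$\overline{T}_{(j,k)} = \sum_{k' \ge k} k' \, \overline{\eta}_{(j,k')},$$

and similarly for $\underline{T}_{(j,k)}$. These are well-defined because the partial sums are increasing and bounded 
by 2 (see the proof of Proposition [3]). The following lemma provides a full coupling of the chains 
$(\mathcal{F}, Q)$, $(\overline{\mathcal{F}}, Q)$ and $(\underline{\mathcal{F}}, Q)$. 

Lemma 2 (Coupling: Full Analysis) For all $1 \le j \le I$ and $k \ge 1$, it holds almost surely that 

$$\limsup_{n \to +\infty} \frac{T_{n,(j,k)}}{n} \le \overline{T}_{(j,k)} \quad \text{and} \quad \liminf_{n \to +\infty} \frac{T_{n,(j,k)}}{n} \ge \underline{T}_{(j,k)}.$$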

Proof of Lemma [2]. As in Lemma [1] we couple the $(\mathcal{F}, Q)$-chain and the truncations. We use the 
notation of Lemma [1]. Also, for $n \ge 1$, let $D_n$ be the degree of the vertex picked at time $n$ in the 
$(\mathcal{F}, Q)$-chain (and similarly for $\overline{D}_n$, $\underline{D}_n$). For $1 \le i \le I$ and $k \ge 1$, let $\sigma_{n-1,(i,k)}$ be the probability of 
the event $\{F_n = f_i, D_n \ge k\}$ given the state after time $n - 1$ in the $(\mathcal{F}, Q)$-chain (and similarly for 
$\overline{\sigma}_{n-1,(i,k)}$, $\underline{\sigma}_{n-1,(i,k)}$). We require the following conditions to be satisfied: 

1. For all $n \ge 1$, $\overline{F}_n \le F_n \le \underline{F}_n$ and $\overline{F}'_n \le F'_n \le \underline{F}'_n$. 

2. For all $n \ge 1$ and all $1 \le i \le I$, $\underline{M}_{n,i} \le M_{n,i} \le \overline{M}_{n,i}$. 

3. For all $n \ge 1$ and all $1 \le i \le I$, $\underline{p}_{n,i} \le p_{n,i} \le \overline{p}_{n,i}$. 

4. For all $n \ge 1$, $1 \le i \le I$, and $k \ge 1$, $\underline{T}_{n,(i,k)} \le T_{n,(i,k)} \le \overline{T}_{n,(i,k)}$. 

5. For all $n \ge 1$, $1 \le i \le I$, and $k \ge 1$, $\underline{\sigma}_{n,(i,k)} \le \sigma_{n,(i,k)} \le \overline{\sigma}_{n,(i,k)}$. 

These conditions are somewhat redundant but we keep all of them for clarity. In particular, note that 
3. follows from 1. and 2., that 5. follows from 1. and 4., and that 2. and 3. are special cases of 4. and 
5. Assume these conditions hold up to $n - 1$. Our step-by-step coupling has two parts. First, we pick 
the fitnesses $\overline{F}_n, F_n, \underline{F}_n$ and $\overline{F}'_n, F'_n, \underline{F}'_n$ using the scheme described in the proof of Lemma [1]. We then 
pick the degrees $\overline{D}_n, D_n, \underline{D}_n$ by picking a single uniform random variable in $[0, 1]$ and "inverting" 
simultaneously the tails $\{\overline{\sigma}_{n-1,(\overline{F}_n,k)}\}_{k \ge 1}$, $\{\sigma_{n-1,(F_n,k)}\}_{k \ge 1}$, and $\{\underline{\sigma}_{n-1,(\underline{F}_n,k)}\}_{k \ge 1}$. (This is sometimes called 
the "inverse transform sampling method".) It is easy to check that all conditions are then satisfied 
at time $n$. ■ 



B.2 Fit-Get-Richer Phase 

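Proof of Theorem [3]. We only need to consider the case $J = +\infty$. Fix $1 \le j < +\infty$ and 
$k \ge 1$. Let $1 \le I < +\infty$ and consider once again the $I$-truncations of the $(\mathcal{F}, Q)$-chain. Let 
$\underline{\nu}^I_j$, $\overline{\nu}^I_j$, $\underline{\eta}^I_{(j,k)}$, $\overline{\eta}^I_{(j,k)}$, $\underline{T}^I_{(j,k)}$, $\overline{T}^I_{(j,k)}$ be as in Lemmas [1] and [2] (we now indicate the dependence on $I$ because 
we will need to take $I \to +\infty$). Similarly, let $\underline{\lambda}^I_0$ and $\overline{\lambda}^I_0$ be the largest solution to ([6]) for the lower 
and upper truncations. By the coupling lemmas, it suffices to prove 

$$\underline{\lambda}^I_0, \ \overline{\lambda}^I_0 \to \lambda_0, \qquad (16)$$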



as / — > +00. Indeed, in that case 
as / ^ +CXO, which implies 

by Lemma [H Also, for all I < k, 
as I ^ +00, which implies 



Mn,j 

— - 



n 



l<k l<k 

as / — > +00, and similarly for This also holds for A; — 1 so that, by Lemma [21 we have 

' 

almost surely as n ^ +oo. 

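It remains to prove (16). We argue about $\overline{\lambda}^I_0$. The proof for $\underline{\lambda}^I_0$ is similar and is omitted. Let 

$$S(\lambda) = \sum_{j=1}^{+\infty} \frac{f_j q_j}{\lambda - f_j}, \qquad \overline{S}^I(\lambda) = \sum_{j=1}^{+\infty} \frac{\overline{f}_j q_j}{\lambda - \overline{f}_j}.$$

Note that for $\lambda' > \lambda > h$, we have 

$$S(\lambda'), S(\lambda) \le \frac{h}{\lambda - h}, \qquad S(\lambda') < S(\lambda),$$

and 

$$|S(\lambda') - S(\lambda)| \le \frac{h (\lambda' - \lambda)}{(\lambda - h)^2}.$$

Therefore, $S$ is continuous and strictly decreasing on $\{\lambda > h\}$. Also, by definition of $\overline{\mathcal{F}}$, we have 

$$\overline{R}^I(\lambda) := S(\lambda) - \overline{S}^I(\lambda) = \sum_{j=I+1}^{+\infty} \frac{f_j q_j}{\lambda - f_j}.$$

Therefore, for $\lambda > h$, 

$$|\overline{R}^I(\lambda)| \le \frac{h}{\lambda - h} \sum_{i=I+1}^{+\infty} q_i \to 0,$$

as $I \to +\infty$. Hence, for all $\varepsilon > 0$ (small enough), 

$$\lim_{I \to \infty} \overline{S}^I(\lambda_0 + \varepsilon) = S(\lambda_0 + \varepsilon) < 1, \qquad \lim_{I \to \infty} \overline{S}^I(\lambda_0 - \varepsilon) = S(\lambda_0 - \varepsilon) > 1,$$

so that eventually 

$$\lambda_0 - \varepsilon < \overline{\lambda}^I_0 < \lambda_0 + \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, we have (16). ■ 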

B.3 Innovation-Pays-Off Phase 

Example 1 The case $J < +\infty$ always satisfies (7). Indeed, in that case, 

$$\sum_{j=1}^{J} \frac{f_j q_j}{h - f_j} = +\infty > 1.$$

Likewise, when $J = +\infty$ and the fitness supremum $h$ is attained, we also get (7). 
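Example 2 Consider the case $f_j = 1 - \frac{1}{j}$ for all $j \ge 1$ and 

$$q_j = \frac{j^{-(2+\theta)}}{\zeta(2 + \theta)},$$

where $\zeta$ is the Riemann zeta function. In particular, by definition, $\sum_{j \ge 1} q_j = 1$. Here, $h = 1$ is not 
attained. We now compute the sum in (8). We have 

$$\sum_{j=1}^{+\infty} \frac{f_j q_j}{1 - f_j} = \sum_{j=1}^{+\infty} \frac{(j - 1)\, j^{-(2+\theta)}}{\zeta(2 + \theta)} = \frac{\zeta(1 + \theta) - \zeta(2 + \theta)}{\zeta(2 + \theta)}.$$

One can check that the last line is $< 1$ when $\theta > 1$. This example can be seen as a "discretization" 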
of the example given in f^. 
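The sum in Example 2 can be evaluated numerically with truncated series. The sketch below assumes $f_j = 1 - 1/j$ and $q_j = j^{-(2+\theta)} / \zeta(2+\theta)$, and approximates both the zeta normalization and the sum by their first `n_terms` terms; the function name and the truncation level are illustrative choices.

```python
def example2_sum(theta, n_terms=100_000):
    """Truncated value of sum_j f_j q_j / (1 - f_j) for the fitness law of Example 2."""
    # Truncated zeta(2 + theta), used to normalize the probabilities q_j.
    z = sum(j ** -(2.0 + theta) for j in range(1, n_terms + 1))
    total = 0.0
    for j in range(2, n_terms + 1):      # the j = 1 term vanishes since f_1 = 0
        f_j = 1.0 - 1.0 / j
        q_j = j ** -(2.0 + theta) / z
        total += f_j * q_j / (1.0 - f_j)
    return total


value_theta_1 = example2_sum(1.0)   # below 1: innovation-pays-off side
value_theta_small = example2_sum(0.1)  # above 1: fit-get-richer side
```

For $\theta = 1$ the closed form $(\zeta(2) - \zeta(3)) / \zeta(3)$ is roughly 0.368, while for small $\theta$ the sum exceeds 1 because $\zeta(1 + \theta)$ becomes large.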

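Proof of Theorem [4]. We use the notations of Theorem [3]. Similarly to Theorem [3], it suffices to 
prove 

$$\underline{\lambda}^I_0, \ \overline{\lambda}^I_0 \to h, \qquad (18)$$

as $I \to +\infty$. Let 

$$h^I = \sup_{j \le I} f_j.$$

By a remark above the statement of the Theorem, we know that $h^I < h$ and $h^I \to h$ as $I \to +\infty$. 

We first argue about $\overline{\lambda}^I_0$. Note that $\overline{\lambda}^I_0 > h^I$. Also, $\overline{S}^I(h) \le S(h) \le 1$ and therefore $\overline{\lambda}^I_0 \le h$. That 
implies $\overline{\lambda}^I_0 \to h$. 

Now consider the case of $\underline{\lambda}^I_0$. Let 

$$\underline{R}^I(\lambda) := S(\lambda) - \underline{S}^I(\lambda).$$

We have, for all $\varepsilon > 0$, 

$$S(h + \varepsilon) < S(h) \le 1,$$

and 

$$|\underline{R}^I(h + \varepsilon)| \le \frac{h}{\varepsilon} \sum_{i=I+1}^{+\infty} q_i \to 0,$$

as $I \to +\infty$. Hence, for all $\varepsilon > 0$, 

$$\lim_{I \to \infty} \underline{S}^I(h + \varepsilon) = S(h + \varepsilon) < 1,$$

so that eventually 

$$h < \underline{\lambda}^I_0 < h + \varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, we have $\underline{\lambda}^I_0 \to h$ as $I \to +\infty$. ■ 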
B.4 Unbounded Countable Case 

Assume $h = \sup_{j \ge 1} f_j = +\infty$, i.e. the set of fitnesses is unbounded. In that case, the lower bounds 
in the coupling lemmas cannot be used but it turns out that the upper bounds suffice to characterize 
the limit behavior of the process. 

Theorem 5 (Discrete Case: Unbounded Fitness) Assume $\sup_{j \ge 1} f_j = +\infty$. Then it holds 
that 

1. For all $1 \le j < +\infty$, 

$$\frac{M_{n,j}}{n} \to q_j,$$

almost surely as $n \to +\infty$. 

2. For all $1 \le j < +\infty$ and $k > 1$, 

$$\frac{N_{n,(j,k)}}{n} \to 0,$$

almost surely as $n \to +\infty$. 

Proof: Fix $1 \le j < +\infty$ and $k > 1$. We use the upper bounds in the coupling Lemmas [1] and [2]. 
We use the notations of Theorems [3] and [4]. We have that $h^I \to +\infty$ and therefore $\overline{\lambda}^I_0 > h^I \to +\infty$. 
Therefore, plugging into the equations for $\overline{\nu}^I_j$ and $\overline{\eta}^I_{(j,k)}$ and letting $I \to +\infty$, we get 

$$\limsup_{n \to \infty} \frac{M_{n,j}}{n} \le q_j,$$

and 

$$\limsup_{n \to \infty} \frac{N_{n,(j,k)}}{n} \le 0,$$

almost surely. We get 2. immediately. To get 1., consider the following chain $\{X_{n,j}\}_{n,j \ge 0}$. Pick a 
fitness say $F_0$ according to $Q$ and let $X_0 = e_{F_0}$. Then at each time step, pick a fitness $F_n$ according 
to $Q$ and set $X_n = X_{n-1} + e_{F_n}$. This chain can clearly be coupled with the $(\mathcal{F}, Q)$-chain in such a 
way that $M_n \ge X_n$ for all $n$. Now it is easy to see that $X_{n,j}/n \to q_j$ as $n \to +\infty$, and therefore 

$$\liminf_{n \to \infty} \frac{M_{n,j}}{n} \ge q_j.$$

This concludes the proof. ■ 



C Analysis of Continuous Fitness Distributions 

In this section, we analyze the preferential attachment scheme under continuous fitness distributions. 
Let $h < +\infty$ (the unbounded case is treated in Appendix C.4) and let $g : [0, h] \to \mathbb{R}_+$ be a 
continuous density function. Consider the preferential attachment process with $\mathcal{F} = [0, h]$ and $Q$ the 
distribution defined by $g$. The dynamical behavior parallels the one observed in the discrete case, 
namely 

1. the fit-get-richer scenario taking place when $\int_0^h \frac{x g(x)}{h - x} \, dx > 1$, 

2. the innovation-pays-off scenario taking place when $\int_0^h \frac{x g(x)}{h - x} \, dx \le 1$. 

The analysis requires a more sophisticated coupling argument than that for the discrete case described 
in Section m 



C.l Coupling 

We discretize the {J^, Q)-chain in the following way, which lets us bound the relevant quantities from 
below only. It will turn out that the lower bound is sufficient for our purposes. Fix 1 < / < -|-oo, an 
integer with e = hj. For 1 < i < /, let 

'> 1 



and 



Qi= g{x)dx. 



Denote Q the distribution over {1,2,... , /} defined by {q'i}i=i- For reasons that will be clear in 
Section rC. 31 we allow Jq g{x)dx < 1. Consider the following finite balls-in-bins process with q = I+l 
bins. The activities are _ 

/i, iii = I + l. 



For the initial load, let 1 < i* < / be picked according to Q and pose 



2, if i = 
0, o.w. 
The update vectors are defined as follows for 1 < i < I: pick i* according to Q (set i* = +∞ with 
probability I — G where G = g{x)dx), let 7i = 1 with probability f./fi and o.w., and set 

r 1, iii' = i*, 

= 7il{j'=j} + S 1, if = / + 1, 7i = 0, 
[ 0, o.w. 

and 

Because this chain is not exactly of the type described in Section [Sj we cannot appeal directly to 
Proposition [2j Therefore, we give a separate analysis here. Let \q> h — ehe a. solution to 



1, 


if i' 


1, 


if i' 


0, 


o.w. 



By monotonicity, it is clear that there is a unique such solution. For 1 < j < /, let 



Dj = Ao 



■J 



and 



Note that 



l + G)Xoh 



Xo-{h- e) 



Ao/ tr^ jri j^i^o-Lj Ao - (/i - e) 



G + (l + G)^ + l 



so that 



We prove the following. 

Lemma 3 (Discretization) For all 1 < j < I + 1, 

— - ^i' 

n 

almost surely as n ^ +00. 

Proof: The matrix A has the following form: for l<i</, 1<J</+1, 



= fi (^ii{j</} + Y^{j=i} + ^i{i=/+i}^ 



(19) 



(1 + G)(1 + - 



^ = 1 + G. (20) 
and for i = I + 1 

Ai+ij = h {qjl{j<i} + l{j=/+i}) . 

We compute the corresponding Xi,vi. Note that by Theorem [2] and the law of large numbers, it 
is clear that 

i+i 

J2Mvl)^ = l + G. (21) 
1=1 



For all 1 < i < t'l must satisfy 



qj^ai{vi)i + fj^{vi)j = Xi{vi] 
J i 



1=1 J 
By the convention 



^a,{vi)^ = l, (22) 



1=1 



it follows that for all 1 < j < / 
Also for ? = / + 1, we must have 



Ai -/. 



(^i)i + h{vi)i+i = e ((1 + G)Ai ^ - {vi)i+i) + h{vi)i+i = Ai(i;i)/+i, 

1=1 J i 



where we have used (j2ip . Therefore, 

_ (i + G)Ar^£ 

- Xi-{h-e)- 

Plugging back into (122]) . we get 



Therefore, Ai = Aq and {vi)j = (Ai) ^Uj for all I < j < q. The result follows by Theorem [2j ■ 

Consider again the Q)-chain. For n > and 1 < j < /, let Mnj be the number of edges with 
an endpoint of fitness in {f f j) (counting twice edges with two endpoints of fitness in ifj,fj))- 
Then we have the following. 

Lemma 4 (Coupling: Continuous Case) For all I < j < I, it holds that 

r ■ f Mn,j . . 
iimmf > Ui 



— ^1 ! 



almost surely. 

Proof: This proof is similar to the proof of Lemma [TJ Consider the {J-, Q)-chain. At step n > 1, 
we first pick a vertex according to weighted preferential attachment. Let Fn be the fitness of the 
chosen vertex, and denote pn-i,i the probability that Fn G if-,fi) given the state after time re — 1. 
Secondly, we add a new vertex with fitness according to Q. Let F^ be the fitness of this new vertex. 
Similarly for the discretized chain, we first pick a bin i by weighted preferential attachment and then 
an i* according to Q. We also pick 7^ a Bernoulli(/y/j). We let 

p ^jli^ if 7^ = 1, . F'=l^i*^ ifi*</, 

\ h, if 7, = 0. \ +00, if i* = +00, 

We denote ^ . the probability that = /j and 7^ = 1 given the state after time n — 1 1. We 
couple the two chains so as to preserve the following conditions: 



1. For all n > 1, 
and 

2. For ah n > 1 and all 1 < i < /, 

3. For all n > 1 and all 1 < i < /, 



F < F 



F' < F' 



Pn.i ^ P-'i- 



Note that 3. follows easily from 1., 2. and the definition of 7^. In fact, the reason for using the 
"rejection" variable ji is to keep . small by making its numerator small — with a contribution 
of only /. — while preserving a large denominator. Here is how our coupling works. In the initial 
configuration, the {J^, Q)-chain has one vertex with a self- loop and fitness Fq = fi, where fi is picked 
according to Q; the discretized chain can be coupled so that two balls are added to a bin with activity 
F_Q = /j with probability f./fi and F_q = h with probability 1 — f./fi- Therefore the conditions 
are satisfied at time by construction. Assume Conditions 1., 2., and 3. are satisfied at time n — 1; 
we will show then that they are also satisfied at time n. First, consider picking fitness for the new 
vertex. In the {J^, Q)-chain, F^ = fi, where fi is picked according to Q; the choice of the discretized 
chain can be coupled so that = /j. Therefore, F^ < F^. Now consider the step of choosing an old 
vertex. By 3., it is clear how to choose the F's so as to satisfy 1. and 2. Indeed, proceed as follows: 

• With probability p^ ^ ,., pick a bin according to {p^ ^ i^i=i ™ discretized chain, say 
i, and pick a fitness according to weighted preferential attachment restricted to (/., / J for the 
(J^, Q)-chain (the interval is nonempty by 2.); 

• With remaining probability, pick bin / + 1 for the discretized chain, pick an interval according 
to {{pn-i,i — p^_-^ say {f.,fi), and pick a fitness according to weighted preferential 
attachment restricted to {f.,fi) for the {J^, Q)-chain. 

This concludes the proof. ■ 



C.2 Fit-Get-Richer Phase 

Assume the density $g$ is defined on $[0, h]$ with $h < +\infty$ and assume further that $g(x) > 0$ for all 
$x \in (0, h)$ (we allow 0 at the endpoints). In this section, we consider the case 

$$\int_0^h \frac{x g(x)}{h - x} \, dx > 1. \qquad (23)$$

The remaining cases are treated in the following two subsections. 



^ The specification that 7^ = 1 is relevant only in the case i = /. 
Example 3 An important special case of (23) is when $g(h) > 0$. Indeed, take any $\delta > 0$ small and 
let $\delta' = \inf_{x \in [h - \delta, h]} g(x)$. Note that $\delta' > 0$ by assumption. Then, 

$$\int_0^h \frac{x g(x)}{h - x} \, dx \ \ge\ \int_{h - \delta}^{h} \frac{x g(x)}{h - x} \, dx \ \ge\ (h - \delta) \delta' \int_{h - \delta}^{h} \frac{1}{h - x} \, dx \ \ge\ (h - \delta) \delta' \int_0^{\delta} \frac{1}{y} \, dy = +\infty > 1.$$

This example will turn out to be useful in Section C.3. 

By (23) and monotonicity, there exists a solution $\lambda_0 > h$ to 

$$\int_0^h \frac{x g(x)}{\lambda_0 - x} \, dx = 1. \qquad (24)$$

For $0 \le a < b \le h$, let 

$$\nu_{[a,b]} = \lambda_0 \int_a^b \frac{g(x)}{\lambda_0 - x} \, dx.$$

Note in particular that 

$$\nu_{[0,h]} = \int_0^h \frac{(\lambda_0 - x) g(x)}{\lambda_0 - x} \, dx + \int_0^h \frac{x g(x)}{\lambda_0 - x} \, dx = 1 + G, \qquad (25)$$

as one would expect (but see Section C.3 below). Also, for $n \ge 0$, let $M_{n,[a,b]}$ be the number of edges 
with an endpoint of fitness in $[a, b]$ (counting twice edges with two endpoints of fitness in $[a, b]$). 
We prove the following. 

Theorem 6 (Continuous Case: Fit-Get-Richer Phase) Assume $g$ is defined on $[0, h]$ with $h < 
+\infty$ and assume further that $g(x) > 0$ for all $x \in (0, h)$ and 

$$\int_0^h \frac{x g(x)}{h - x} \, dx > 1.$$

Then it holds that for all $0 \le a < b \le h$, 

$$\frac{M_{n,[a,b]}}{n} \to \nu_{[a,b]},$$

almost surely as $n \to +\infty$. 
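For a concrete density, the limit in Theorem 6 can be approximated numerically. The sketch below assumes the defining equation $\int_0^h x g(x) / (\lambda_0 - x)\,dx = 1$ and the limit $\nu_{[a,b]} = \lambda_0 \int_a^b g(x) / (\lambda_0 - x)\,dx$, uses a plain midpoint rule, and takes the uniform density on $[0, 1]$ as an illustrative example; all function names and grid sizes are choices made here.

```python
def solve_lambda0_continuous(g, h, n_grid=20_000, tol=1e-10):
    """Approximate the root lambda_0 > h of int_0^h x g(x)/(lambda - x) dx = 1."""
    dx = h / n_grid
    xs = [(i + 0.5) * dx for i in range(n_grid)]   # midpoint rule nodes
    w = [x * g(x) * dx for x in xs]                # weights x g(x) dx
    mean = sum(w)

    def S(lam):
        return sum(wi / (lam - x) for wi, x in zip(w, xs))

    # S is decreasing in lambda and S(h + mean) < 1; the fit-get-richer
    # condition is assumed, so that S exceeds 1 just above h.
    lo, hi = h, h + mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if S(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def nu_interval(g, lam0, a, b, n_grid=20_000):
    """Midpoint-rule value of lam0 * int_a^b g(x) / (lam0 - x) dx."""
    dx = (b - a) / n_grid
    total = 0.0
    for i in range(n_grid):
        x = a + (i + 0.5) * dx
        total += g(x) / (lam0 - x)
    return lam0 * total * dx


def uniform_density(x):
    return 1.0


lam0_uniform = solve_lambda0_continuous(uniform_density, 1.0)
share_top_half = nu_interval(uniform_density, lam0_uniform, 0.5, 1.0)
```

For the uniform density the equation reduces to $\lambda_0 \ln(\lambda_0 / (\lambda_0 - 1)) = 2$, whose root is close to 1.255, and the shares over $[0, 1]$ add up to 2.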

Proof: Note that the law of large numbers implies 

$$\frac{M_{n,[0,h]}}{n} \to 1 + G,$$

almost surely as $n \to +\infty$, so that by (25) it suffices to show that 

$$\liminf_{n \to +\infty} \frac{M_{n,[a,b]}}{n} \ge \nu_{[a,b]},$$

almost surely for all $0 \le a < b \le h$. 

Let 1 < / < +00 and consider once again the discretization of the {T ^ Q)-chain. Let i>J be as in 
Lemma m (we now indicate the dependence on I because we will need to take / — > +oo). Similarly, 
let Ag be as in (fT9l) . Fix < a < 6 < /i. Let be the largest subset of {1, ... , 1} such that 



By the coupling lemma, we have 



hminf^^ > Yvl 



ieici ^0 Li 



^Jf^ \i + e-x 
> r\/(-) d.. 

J a+£ Ag 



\n+ e — X 



Since e = l/I goes to as / — > +oo, it suffices to prove 

A^ ^ Ao, (26) 

as / — > +00. 

We first show that Aq > Aq — £• Indeed, assume Ag = Ag — e. Then, the sum in ()19p satisfies 
^ , ^ (l + G)(Ag^)"^. ^ 

h~^i-!!, ~K~ih-E) friH-f 



> 



which proves the claim, by monotonicity. 

Take any Ag > Aq. We show that eventually, Ag < Ag. Let 

X(A)= f'^dx, 



^ xg{x) 
Ao - x 



A — X 



and note that T{\q) < 1. From ([25]) . we get 



I Y^l I ^7 / flql 



7o Aq - X Jo Xq- X 
< 6(l + G)(Ao)"^+X(Ag). 

As for the other term in (j20p . note that as soon as 

Ao > Ao(l + (1 + G)(Ao)-^V^) > h{l + (1 + G)(Xo)-'V^) - e, 
(the second inequality is always true), we have 



Xo- {h- e) 



Therefore 



E 




{l_+G){Xo)-h 
Xo - {h- e) 



< V^ + e(l + G)(Ao)^^+X(Ao) < 1 



Ao - f 

—3 



for / large enough, which proves the claim by ()20p and monotonicity. Furthermore, since Aq ^ Aq is 
arbitrary, we have (j26p . This concludes the proof. ■ 

C.3 Innovation-Pays-Off Phase 

Assume the density $g$ is defined on $[0, h]$ with $h < +\infty$ and assume further that $g(x) > 0$ for all 
$x \in (0, h)$ (we allow 0 at the endpoints). In this section, we consider the case 




(27) 



We also assume 




(28) 



although this is not necessary. 



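Example 4 Consider the case where $Q$ is Beta$(\alpha, \beta)$. Then it is easy to show that 

$$\int_0^1 \frac{x g(x)}{1 - x} \, dx = \frac{B(\alpha + 1, \beta - 1)}{B(\alpha, \beta)},$$

where $B$ is the Beta function. Therefore, (27) is satisfied if $\beta > \alpha + 1$. This example is a 
generalization of the example given in 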
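The Beta example lends itself to a quick numerical check. The sketch below assumes the identity $\int_0^1 x g(x) / (1 - x)\,dx = B(\alpha + 1, \beta - 1) / B(\alpha, \beta)$ for the Beta$(\alpha, \beta)$ density $g$ with $\beta > 1$, and treats the integral as divergent otherwise; the function names and parameter values are illustrative.

```python
import math


def beta_fn(a, b):
    """Euler Beta function B(a, b) via Gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def beta_phase_integral(alpha, beta):
    """Value of int_0^1 x g(x) / (1 - x) dx for g the Beta(alpha, beta) density.

    The integrand behaves like (1 - x)^(beta - 2) near x = 1, so the integral
    is taken to be +infinity when beta <= 1.
    """
    if beta <= 1:
        return math.inf
    return beta_fn(alpha + 1, beta - 1) / beta_fn(alpha, beta)


value_below_one = beta_phase_integral(2.0, 4.0)   # beta > alpha + 1
value_above_one = beta_phase_integral(2.0, 2.5)   # beta < alpha + 1
```

The ratio simplifies to $\alpha / (\beta - 1)$, so it is below 1 exactly when $\beta > \alpha + 1$, which matches the condition stated in the example.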



By (p7|) . there is no solution Xq > h to 




Instead, for < a < 6 < /i, let 




Note in particular that 




Also, for n > 0, let M„ j^j ^,] be the number of edges with an endpoint of fitness in [a,b] (counting 
twice edges with two endpoints of fitness in [a, b]). For ease of notation, we note Mn^x '■= Mn^[x,x]- 
We prove the following. 
Theorem 7 (Continuous Case: Innovation-Pays-Off Phase) Assume $g$ is defined on $[0, h]$ with 
$h < +\infty$ and assume further that $g(x) > 0$ for all $x \in (0, h)$ and 

Jo i^-x 

Then it holds that for all < a < b < h, 

Mn,[a,b] 



n 

almost surely as n ^ +00. Moreover, for < a < h, we have 



Mn,[a,h] 

n 

almost surely as n ^ +00. 



2-z^[o,a], (30) 



Proof: The convergence ()30p follows trivially from (|29p . Also, from the proof of Theorem[6]it follows 
that 

limmf > i/u fci, 

n— >+oo n ^ ' 

almost surely for all < a < 6 < /i (replace Aq with h in the proof). 

To obtain an upper bound, we consider the modified chain with fitness distribution Qg with 



ge[.x) 



g{x), < X < h — e, 
0, X > h — e. 



It is clear that we can couple this modified chain with the original one so that for allO<a<6</i— e 

(Proceed similarly to the proof of Lemma [TJ) Also, from Example El it follows that the modified 
chain is in the Fit-Get-Richer phase which allows to apply Theorem [6] (this is the reason we allowed 
G < 1 in the proof of Theorem ED . Therefore, for all < a < 6 < /i — e, 

n^+oo n Ja Xy — X 



where Aq*^^ > h — e is a solution to 



We claim that Aq"^^ ^ /i as e — > which proves (j29|) . Indeed, note that 



— r^; dx = 1. 

\f - X 



xg{xl ^ xg{x)_ ^ ^ 
h — X ~ Jq h — X 

Therefore, h — e < Aq"^^ < h. This concludes the proof. ■ 
C.4 Unbounded Case 

The unbounded fitness case also follows easily from the previous proof (see also the proof in the 
discrete case). Therefore, we state the result without proof. 

Theorem 8 (Continuous Case: Unbounded Case) Assume $g$ is defined on $[0, +\infty)$. Assume 
further that $g(x) > 0$ for all $x \in (0, +\infty)$ and 




Then it holds that for all < a < b < +oo 




almost surely as n 



+00. Moreover, for < a < +oo, we have 




almost surely as n ^ +oo. 